308 research outputs found

    DASMIweb: online integration, analysis and assessment of distributed protein interaction data

    Get PDF
    In recent years, we have witnessed a substantial increase of the amount of available protein interaction data. However, most data are currently not readily accessible to the biologist at a single site, but scattered over multiple online repositories. Therefore, we have developed the DASMIweb server that affords the integration, analysis and qualitative assessment of distributed sources of interaction data in a dynamic fashion. Since DASMIweb allows for querying many different resources of protein and domain interactions simultaneously, it serves as an important starting point for interactome studies and assists the user in finding publicly accessible interaction data with minimal effort. The pool of queried resources is fully configurable and supports the inclusion of own interaction data or confidence scores. In particular, DASMIweb integrates confidence measures like functional similarity scores to assess individual interactions. The retrieved results can be exported in different file formats like MITAB or SIF. DASMIweb is freely available at http://www.dasmiweb.de

    Bridging topological and functional information in protein interaction networks by short loops profiling

    Get PDF
    Protein-protein interaction networks (PPINs) have been employed to identify potential novel interconnections between proteins as well as crucial cellular functions. In this study we identify fundamental principles of PPIN topologies by analysing network motifs of short loops, which are small cyclic interactions of between 3 and 6 proteins. We compared 30 PPINs with corresponding randomised null models and examined the occurrence of common biological functions in loops extracted from a cross-validated high-confidence dataset of 622 human protein complexes. We demonstrate that loops are an intrinsic feature of PPINs and that specific cell functions are predominantly performed by loops of different lengths. Topologically, we find that loops are strongly related to the accuracy of PPINs and define a core of interactions with high resilience. The identification of this core and the analysis of loop composition are promising tools to assess PPIN quality and to uncover possible biases from experimental detection methods. More than 96% of loops share at least one biological function, with enrichment of cellular functions related to mRNA metabolic processing and the cell cycle. Our analyses suggest that these motifs can be used in the design of targeted experiments for functional phenotype detection.This research was supported by the Biotechnology and Biological Sciences Research Council (BB/H018409/1 to AP, ACCC and FF, and BB/J016284/1 to NSBT) and by the Leukaemia & Lymphoma Research (to NSBT and FF). SSC is funded by a Leukaemia & Lymphoma Research Gordon Piller PhD Studentship

    Polycation-π Interactions Are a Driving Force for Molecular Recognition by an Intrinsically Disordered Oncoprotein Family

    Get PDF
    Molecular recognition by intrinsically disordered proteins (IDPs) commonly involves specific localized contacts and target-induced disorder to order transitions. However, some IDPs remain disordered in the bound state, a phenomenon coined "fuzziness", often characterized by IDP polyvalency, sequence-insensitivity and a dynamic ensemble of disordered bound-state conformations. Besides the above general features, specific biophysical models for fuzzy interactions are mostly lacking. The transcriptional activation domain of the Ewing's Sarcoma oncoprotein family (EAD) is an IDP that exhibits many features of fuzziness, with multiple EAD aromatic side chains driving molecular recognition. Considering the prevalent role of cation-π interactions at various protein-protein interfaces, we hypothesized that EAD-target binding involves polycation- π contacts between a disordered EAD and basic residues on the target. Herein we evaluated the polycation-π hypothesis via functional and theoretical interrogation of EAD variants. The experimental effects of a range of EAD sequence variations, including aromatic number, aromatic density and charge perturbations, all support the cation-π model. Moreover, the activity trends observed are well captured by a coarse-grained EAD chain model and a corresponding analytical model based on interaction between EAD aromatics and surface cations of a generic globular target. EAD-target binding, in the context of pathological Ewing's Sarcoma oncoproteins, is thus seen to be driven by a balance between EAD conformational entropy and favorable EAD-target cation-π contacts. Such a highly versatile mode of molecular recognition offers a general conceptual framework for promiscuous target recognition by polyvalent IDPs. © 2013 Song et al

    Accounting for Redundancy when Integrating Gene Interaction Databases

    Get PDF
    During the last years gene interaction networks are increasingly being used for the assessment and interpretation of biological measurements. Knowledge of the interaction partners of an unknown protein allows scientists to understand the complex relationships between genetic products, helps to reveal unknown biological functions and pathways, and get a more detailed picture of an organism's complexity. Being able to measure all protein interactions under all relevant conditions is virtually impossible. Hence, computational methods integrating different datasets for predicting gene interactions are needed. However, when integrating different sources one has to account for the fact that some parts of the information may be redundant, which may lead to an overestimation of the true likelihood of an interaction. Our method integrates information derived from three different databases (Bioverse, HiMAP and STRING) for predicting human gene interactions. A Bayesian approach was implemented in order to integrate the different data sources on a common quantitative scale. An important assumption of the Bayesian integration is independence of the input data (features). Our study shows that the conditional dependency cannot be ignored when combining gene interaction databases that rely on partially overlapping input data. In addition, we show how the correlation structure between the databases can be detected and we propose a linear model to correct for this bias. Benchmarking the results against two independent reference data sets shows that the integrated model outperforms the individual datasets. Our method provides an intuitive strategy for weighting the different features while accounting for their conditional dependencies

    Inferring the conservative causal core of gene regulatory networks

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically.</p> <p>Results</p> <p>In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from <it>E. coli </it>that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently.</p> <p>Conclusions</p> <p>For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.</p

    Inferring the conservative causal core of gene regulatory networks

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Inferring gene regulatory networks from large-scale expression data is an important problem that received much attention in recent years. These networks have the potential to gain insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically.</p> <p>Results</p> <p>In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations and demonstrate also for biological expression data from <it>E. coli </it>that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it has also a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently.</p> <p>Conclusions</p> <p>For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also result in superior results.</p

    An Activation Force-based Affinity Measure for Analyzing Complex Networks

    Get PDF
    Affinity measure is a key factor that determines the quality of the analysis of a complex network. Here, we introduce a type of statistics, activation forces, to weight the links of a complex network and thereby develop a desired affinity measure. We show that the approach is superior in facilitating the analysis through experiments on a large-scale word network and a protein-protein interaction (PPI) network consisting of ∼5,000 human proteins. The experiment on the word network verifies that the measured word affinities are highly consistent with human knowledge. Further, the experiment on the PPI network verifies the measure and presents a general method for the identification of functionally similar proteins based on PPIs. Most strikingly, we find an affinity network that compactly connects the cancer-associated proteins to each other, which may reveal novel information for cancer study; this includes likely protein interactions and key proteins in cancer-related signal transduction pathways

    Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>In the literature, there are fruitful algorithmic approaches for identification functional modules in protein-protein interactions (PPI) networks. Because of accumulation of large-scale interaction data on multiple organisms and non-recording interaction data in the existing PPI database, it is still emergent to design novel computational techniques that can be able to correctly and scalably analyze interaction data sets. Indeed there are a number of large scale biological data sets providing indirect evidence for protein-protein interaction relationships.</p> <p>Results</p> <p>The main aim of this paper is to present a prior knowledge based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology. Higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study (i) to encode the functional pairs into the existing PPI networks; and (ii) to use these functional pairs as pairwise constraints to supervise the existing functional module identification algorithms. Topology-based modularity metric and complex annotation in MIPs will be used to evaluate the identified functional modules by these two approaches.</p> <p>Conclusions</p> <p>The experimental results on Yeast PPI networks and GO have shown that the prior knowledge based learning methods perform better than the existing algorithms.</p

    The UCSC Genome Browser Database: 2008 update

    Get PDF
    The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrate and 21 invertebrate species as of September 2007. For each assembly, the GBD contains a collection of annotation data aligned to the genomic sequence. Highlights of this year's additions include a 28-species human-based vertebrate conservation annotation, an enhanced UCSC Genes set, and more human variation, MGC, and ENCODE data. The database is optimized for fast interactive performance with a set of web-based tools that may be used to view, manipulate, filter and download the annotation data. New toolset features include the Genome Graphs tool for displaying genome-wide data sets, session saving and sharing, better custom track management, expanded Genome Browser configuration options and a Genome Browser wiki site. The downloadable GBD data, the companion Genome Browser toolset and links to documentation and related information can be found at: http://genome.ucsc.edu/
    corecore